Churn and Default
Introduction
to-do: add intro
Literature Review
to-do: to be added/folded into the Introduction to frame the macro-scope of this study
note
Analysis is awaiting Experian appending.
This analysis summaries the Nautilus customer data from the Frog Hollow and Howell community solar farm participation from January 2020 to April 2022. The goal of the analysis is to first describe the characteristics of the residents, the prevalence of default and/or churn rates, the prevalence of churn rates, and any statistically significant differences between groups in their default, late payments or churn rates.
Methods
The data is comprised of monthly payment performance and, where available, demographic data for residents. 32,385 monthly observations over 813 homes, 812 accounts and 621 users were initially observed. To avoid data quality issues and to analyze any potential churn and default trends, for Frog Hollow households the analysis will drop values in October 2021 until April 2022. This narrowed dataset is comprised of 31,704 observations; the homes, accounts and users did not change.
The first ten rows of the data are presented, grouped by account number. Unique identifiers were replaced with sequential IDs.
- Tenure: length in months each account was active
- Payment_Method: either card or ACH/direct debit method
- temp_solar: annual (right?) solar allocation in kWh
- Churn: binary _var_iable capturing if an account left the solar farm
- defauted: binary variable capturing if account holder defaulted on payment
- Income: Low (<$50,000) or High (>$50,000)
- Solar_Farm: Either Howell or Frog Hollow
| Account_ID | tenure | payment_method | temp_solar | Churn | solar_farm | defaulted | count_user_id | Income |
|---|---|---|---|---|---|---|---|---|
| 1 | 17 | card | 25.68 | 0 | Howell (O&R) | 0 | 1 | NA |
| 2 | 22 | card | 6.65 | 0 | Howell (O&R) | 0 | 1 | NA |
| 3 | 18 | card | 0.96 | 1 | Howell (O&R) | 0 | 1 | NA |
| 4 | 24 | card | 83.40 | 0 | Howell (O&R) | 0 | 1 | NA |
| 5 | 25 | card | 6.05 | 0 | Howell (O&R) | 0 | 1 | High |
| 6 | 23 | ach | 14.54 | 0 | Howell (O&R) | 0 | 1 | NA |
| 7 | 23 | ach | 30.65 | 0 | Howell (O&R) | 0 | 1 | NA |
| 8 | 3 | NA | 24.87 | 0 | Frog Hollow (CHGE) | 0 | 1 | High |
| 9 | 25 | card | 7.92 | 0 | Howell (O&R) | 0 | 1 | NA |
| 10 | 23 | card | 11.20 | 0 | Howell (O&R) | 0 | 1 | High |
Descriptive Statistics
Total
The summary statistics below provide additional information. The average tenure of the 812 accounts was 20.3 months. The churn rate was 12.4%. The sample does skew towards high income homes, however note a large number of missing values were observed in income (567, 70%).
| Total | Mean | Max | Min | |
|---|---|---|---|---|
| Tenure | 16486.00 | 20.30 | 25.0 | 1 |
| Defaults | 6.00 | 0.01 | 1.0 | 0 |
| Churn | 118.00 | 0.15 | 1.0 | 0 |
| Low Income | 21.00 | 0.09 | NA | NA |
| High Income | 224.00 | 0.91 | NA | NA |
| Solar | 6294.78 | 7.75 | 83.4 | 0 |
| Payment by Card | 565.00 | 0.70 | NA | NA |
| Payment by ACH | 239.00 | 0.30 | NA | NA |
Frog Hollow
| Total | Mean | Max | Min | |
|---|---|---|---|---|
| Tenure | 8799.00 | 19.38 | 22.00 | 2 |
| Defaults | 1.00 | 0.00 | 1.00 | 0 |
| Churn | 59.00 | 0.13 | 1.00 | 0 |
| Low Income | 9.00 | 0.21 | NA | NA |
| High Income | 33.00 | 0.79 | NA | NA |
| Solar | 3269.62 | 7.20 | 48.14 | 0 |
| Payment by Card | 322.00 | 0.72 | NA | NA |
| Payment by ACH | 125.00 | 0.28 | NA | NA |
Howell
| Total | Mean | Max | Min | |
|---|---|---|---|---|
| Tenure | 7687.00 | 21.47 | 25.0 | 1 |
| Defaults | 5.00 | 0.01 | 1.0 | 0 |
| Churn | 59.00 | 0.16 | 1.0 | 0 |
| Low Income | 12.00 | 0.06 | NA | NA |
| High Income | 191.00 | 0.94 | NA | NA |
| Solar | 3025.16 | 8.45 | 83.4 | 0 |
| Payment by Card | 243.00 | 0.68 | NA | NA |
| Payment by ACH | 114.00 | 0.32 | NA | NA |
Tenure Length
Obersvations are next grouped by tenure length.
| Number of homes | Churned | Defaulted | Average Solar Allocation | Payment by Card | Payment by ACH | |
|---|---|---|---|---|---|---|
| ≤ 5 | 29 | 12 | 0 | 5.01 | 18 | 3 |
| 5 to 10 | 42 | 24 | 0 | 7.27 | 35 | 7 |
| 10 to 15 | 23 | 16 | 3 | 7.66 | 19 | 4 |
| 15 to 20 | 718 | 66 | 3 | 7.89 | 493 | 225 |
| Total | 812 | 118 | 6 | 7.75 | 565 | 239 |
For homes that churned, the tenure was reasonbly much shorter than those that did not churn. The below graph shows the average tenure between these two groups, showing that it is almost double for those homes that exhibit continuous subscription.
Income
| Income | Number of accounts | Churned | Defaulted | Average Solar Allocation | Payment by Card | Payment by ACH |
|---|---|---|---|---|---|---|
| High | 224 | 54 | 3 | 6.56 | 183 | 37 |
| Low | 21 | 9 | 1 | 4.54 | 16 | 4 |
| NA | 567 | 55 | 2 | 8.34 | 366 | 198 |
| Total | 812 | 118 | 6 | 7.75 | 565 | 239 |
Churn Reasons
When a respondent left the program, a reason was recorded, when available. The below summarises the accounts that experienced a churn, which may further be collapsed into categories. Note the total below may not match the churn totals in the above descriptive statistics due to defaulted payments and duplicates included.
temp <- raw %>%
group_by(user_id,utility_acct_number) %>%
summarise(tenure = length(unique(date_concat)),
count_accts = n(),
default = sum(default_tag=="Defaulted Payment"),
# count_accts = length(unique(utility_acct_number)) ,
count_churn = sum(default_tag !=0 & default_tag !="Defaulted Payment")/length(unique(date_concat))) #%>%## `summarise()` has grouped output by 'user_id'. You can override using the
## `.groups` argument.
# filter(count_accts > 1)
temp## # A tibble: 812 × 6
## # Groups: user_id [620]
## user_id utility_acct_number tenure count_accts default count_churn
## <dbl> <chr> <int> <int> <int> <dbl>
## 1 17 1769073011 23 23 0 0
## 2 19 7989248005 3 3 0 1
## 3 31 689005016 21 44 0 2.10
## 4 32 21002383038 22 23 0 0
## 5 33 2155441000 15 30 0 2
## 6 33 2536130080 17 34 0 2
## 7 37 21003282874 22 46 0 0
## 8 38 21001139837 22 69 0 0
## 9 39 7376061001 22 44 0 0
## 10 42 4381440011 15 15 0 1
## # … with 802 more rows
Results
Churn Figures
Solar
Tenure & Churn by Payment Method
Default Figures
Solar
Tenure & Default by Payment Method
Model
A logit model is deployed to analyze the probability of a home churning. The model in brief is described below. Without additional data collected from Experian, limited variability can be measured due to NA values in addition to limited number of coefficients to measure.
In the limited data available, statistically significant effects were measured for probability of churn for both tenure (Months) and days late. Due to relatively small number of observations of late payments, no statistically significant results were measured in probability of late or default payments.
\[\begin{equation} P_{d} = \beta_{0} + \beta_{1}*Months +\beta_{2}*(Payment Method = Card) +\beta_{3}*Income=Low \\ +\beta_{4}*Solar kwH + \beta{5}*(Solar Farm=Howell) \end{equation}\]
##
## Call:
## glm(formula = Churn ~ tenure + payment_method + Income + temp_solar +
## solar_farm, family = binomial(link = "logit"), data = joined)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -2.8845 -0.4531 -0.4198 0.2479 2.2907
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 5.26675 1.03903 5.069 0.000000400133 ***
## tenure -0.26409 0.04308 -6.131 0.000000000876 ***
## payment_methodcard -0.24492 0.51325 -0.477 0.633
## IncomeLow 0.55029 0.63145 0.871 0.383
## temp_solar -0.01833 0.04537 -0.404 0.686
## solar_farmHowell (O&R) -1.15128 0.50626 -2.274 0.023 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 276.31 on 239 degrees of freedom
## Residual deviance: 176.59 on 234 degrees of freedom
## (572 observations deleted due to missingness)
## AIC: 188.59
##
## Number of Fisher Scoring iterations: 5
The logit curves below show the relationship between the distribution of both tenure and days late with the probability of churning. The first graph shows that the longer the tenure, the lower the likelihood. The second graph shows that the days of late payments are generally not associated with likelihood of churn.